A Gossip-Style Failure Detection Service
Authors
Abstract
Failure detection is valuable for system management, replication, load balancing, and other distributed services. To date, failure detection services have scaled poorly with the number of members being monitored. This paper describes a new gossip-based protocol that scales well and provides timely detection. We analyze the protocol, and then extend it to discover and exploit the underlying network topology for much improved resource utilization. We then combine it with a second, broadcast-based protocol that handles partition failures.
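To make the idea of gossip-based detection concrete, here is a minimal sketch. It assumes a heartbeat-counter formulation, which the abstract itself does not spell out: each member keeps a table of heartbeat counters, periodically sends the table to one randomly chosen peer, and suspects any member whose counter has not increased for a timeout. The names `Member`, `gossip_round`, and `t_fail` are hypothetical and chosen for illustration only; topology discovery and the broadcast protocol for partitions are not modeled.

```python
import random


class Member:
    """One participant in a hypothetical gossip-style failure detector."""

    def __init__(self, name, t_fail=5):
        self.name = name
        self.t_fail = t_fail  # assumed timeout, measured in gossip rounds
        # name -> (heartbeat counter, local round when it last increased)
        self.table = {name: (0, 0)}

    def tick(self, now):
        """Increase our own heartbeat counter at local round `now`."""
        hb, _ = self.table[self.name]
        self.table[self.name] = (hb + 1, now)

    def merge(self, remote_table, now):
        """Adopt every heartbeat that is strictly higher than ours."""
        for name, (hb, _) in remote_table.items():
            local_hb, _ = self.table.get(name, (-1, now))
            if hb > local_hb:
                self.table[name] = (hb, now)

    def suspects(self, now):
        """Members whose heartbeat has not increased for over t_fail rounds."""
        return sorted(
            name
            for name, (_, seen) in self.table.items()
            if name != self.name and now - seen > self.t_fail
        )


def gossip_round(members, now, rng, crashed=()):
    """Each live member ticks and gossips its table to one random peer.

    Requires at least two members. Messages sent to a crashed member are
    simply lost.
    """
    for m in members:
        if m.name in crashed:
            continue
        m.tick(now)
        peer = rng.choice([p for p in members if p is not m])
        if peer.name not in crashed:
            peer.merge(m.table, now)


if __name__ == "__main__":
    rng = random.Random(0)
    group = [Member(f"m{i}", t_fail=10) for i in range(6)]
    for now in range(1, 61):
        # m5 stops gossiping from round 20 onward.
        gossip_round(group, now, rng, crashed=("m5",) if now >= 20 else ())
    print({m.name: m.suspects(60) for m in group[:-1]})
```

Each member sends one message per round regardless of group size, which is the property that lets this style of protocol scale; the price is that detection is probabilistic, so the timeout has to be chosen against the time gossip needs to spread.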
Similar resources
Experimental Evaluation of a Failure Detection Service Based on a Gossip Strategy
Failure detectors were first proposed as an abstraction that makes it possible to solve consensus in asynchronous systems. A failure detector is a distributed oracle that provides information about the state of processes of a distributed system. This work presents a failure detection service based on a gossip strategy. The service was implemented on the JXTA platform. A simulator was also imple...
Survey on Scalable Failure Detectors
Maintaining a timely view of the current system status is essential to the performance and functionality of distributed systems. Failure detectors have long been essential to distributed systems. In this paper, we evaluate two failure detection algorithms specifically aimed at large-scale systems. Both assume fail-stop (non-Byzantine) models but the similarities end there. Dynamo’s failure dete...
Revisiting Gossip-style Failure Detection in Wireless Sensor Network
Wireless sensor networks often operate in inaccessible and inhospitable environments in order to monitor phenomena. Providing dependable monitoring is therefore especially challenging. As a first step toward that goal, we propose a gossip-style failure detector that provides hints about failures. More precisely, we introduce a few gossiping policies involving a random selection of the gossiper...
Gossip-based Failure Detection and Consensus for Terascale Computing
Thesis presented to the Graduate School of the University of Florida in partial fulfillment of the requirements for the degree of Master of Science. By Rajagopal Subramaniyan, May 2003. Chair: Alan D. George. Department: Electrical and Computer Engineering. One promising avenue of research on failure detection for large systems ...
A Failure Detection Service Based on Epidemic Dissemination for Peer-to-Peer Networks
Failure detectors were first proposed as an abstraction that makes it possible to solve consensus in asynchronous systems. A failure detector is a distributed oracle that provides information about the state of processes of a distributed system. This work presents a failure detection service based on a gossip strategy. The service was implemented on the JXTA platform. A simulator was also imple...